Human Genetics and Genomics Advances

Elsevier BV

Preprints posted in the last 90 days, ranked by how well they match Human Genetics and Genomics Advances's content profile, based on 70 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.

1
Improving GWAS performance in underrepresented groups by appropriate modeling of genetics, environment, and sociocultural factors

Cataldo-Ramirez, C.; Lin, M.; McMahon, A.; Gignoux, C.; Weaver, T. D.; Henn, B. M.

2026-04-08 genetics 10.1101/2024.10.28.620716 medRxiv
Top 0.1%
16.7%

Genome-wide association studies (GWAS) and polygenic score (PGS) development are typically constrained by the data available in biobank repositories in which European cohorts are vastly overrepresented. Here, we increase the utility of non-European participant data within the UK Biobank (UKB) by characterizing the genetic affinities of UKB participants who self-identify as Bangladeshi, Indian, Pakistani, "White and Asian" (WA), and "Any Other Asian" (AOA), towards creating a more robust South Asian sample size for future genetic analyses. We assess the relationships between genetic structure and self-selected ethnic identities and use consistent patterns of clustering in the dataset to train a support vector machine (SVM). The SVM was utilized to reassign n = 1,853 AOA and WA participants at the subcontinental level, and increase the sample size of the UKB South Asian group by 1,381 additional participants. We further leverage these samples to assess GWAS performance and PGS development. We include environmental covariates in the height GWAS by implementing a rigorous covariate selection procedure, and compare the outputs of two GWAS models: GWASnull and GWASenv. We show that PGSs derived from both GWAS models yield prediction comparable to that of PGS models developed with an order of magnitude larger training sample, and that environmentally-adjusted PGS models reduce the sex-bias in predictive performance. In summary, we demonstrate how GWAS performance can be improved by leveraging ambiguous ethnicity codes, ancestry matched imputation panels, and including environmental covariates.

2
Using HiFi Long-Read Whole Genome Sequencing To Enhance Diagnosis In Patients With Subfertility And/Or Recurrent Pregnancy Loss

Teo, J. X.; Cheawsamoot, C.; Kim, D.; Goh, J. C.-Y.; Kam, S.; Chan, S. S.-M.; Yang, L.; Liu, S.; Chua, K. P.; Cheng, W.; Ma, G.-C.; Chang, T.-Y.; Lin, Y.-S.; Wu, K.-M.; Yu, E. J.; Kim, Y.; Seong, M.-W.; Thuwanut, P.; Tuntiviriyapun, P.; Suebthawinkul, C.; Srichomthong, C.; Chetruengchai, W.; Kanlayaprasit, S.; Wongong, R.; Korlach, J.; Lee, J.-S.; Chen, M.; Hwang, S.; Lim, W. K.; Shotelersuk, V.; Jamuar, S. S.

2026-05-08 sexual and reproductive health 10.64898/2026.05.01.26352136 medRxiv
Top 0.1%
14.1%

Subfertility and recurrent pregnancy loss (RPL) affect a significant proportion of couples worldwide. Genetic causes can be seen in up to 30% of these individuals but require multiple genetic tests, which often impede a comprehensive work up. Newer genomic technologies, such as PacBio HiFi long read sequencing (LRS), can detect most subclasses of variations (such as structural rearrangement, monogenic disorders) through one single test. In this multicenter study, we enrolled couples with unexplained subfertility and/or RPL and performed HiFi LRS to determine the underlying genetic etiology. Participants were recruited using standardized inclusion/exclusion criteria to rule out other known causes of subfertility and/or RPL. 96 individuals were recruited across the 5 sites. Average age of participants was 36 years (range 30-46 years). Among the 84 individuals who completed sequencing, 4.8% were identified with a likely genetic diagnosis and variants of uncertain significance were identified in another 14.2% of individuals. One individual was identified with an ACMG secondary finding, and while multiple carriers for recessive genetic disorders were identified, none of the couples were identified to be at increased risk. This study highlights the utility of performing genomic sequencing in couples with unexplained subfertility and/or RPL, with 1 in 10 couples harboring a clinically significant variant. In addition, use of HiFi LRS allowed for characterization of different subclasses of genomic variations through a single test. Future studies, including exploring the cost effectiveness and resource utilization of LRS as first line test, will help in optimizing care for such couples.
Tweetable statement: A single long-read genome sequencing test can consolidate multiple genetic investigations and uncover clinically relevant causes in couples with unexplained subfertility and recurrent pregnancy loss.
At a glance. Why was this study conducted? Many couples with subfertility and recurrent pregnancy loss remain undiagnosed after multiple conventional genetic tests. Existing workflows require sequential testing and may miss complex genomic variants. What are the key findings? Long-read genome sequencing identified clinically relevant variants in ~1 in 10 couples with unexplained subfertility or recurrent pregnancy loss. A single assay enabled detection of multiple variant types, including structural and sequence variants. What does this study add to what is already known? Demonstrates feasibility of a unified genomic testing approach in a real-world multicenter cohort. Supports a potential shift from fragmented testing toward a single comprehensive genomic workflow.

3
Inactivating PLEKHA6 Mutations Cause Idiopathic Hypogonadotropic Hypogonadism Through Impaired Kisspeptin Secretion

Topaloglu, A. K.; Plummer, L.; Su, C.-W.; Kotan, L. D.; Celmeli, G.; Simsek, E.; Zhao, Y.; Stamou, M.; Anik, A.; Döger, E.; Altıncık, S. A.; Mengen, E.; Koc, A. F.; Akkus, G.; Balasubramanian, R.; Turan, I.; Seminara, S. B.; Yuksel, B.

2026-04-13 pediatrics 10.64898/2026.04.10.26349358 medRxiv
Top 0.1%
12.8%

Purpose: Idiopathic hypogonadotropic hypogonadism (IHH) is characterized by impaired reproductive maturation, and approximately half of all cases lack an identified genetic cause. We investigated the genetic basis of IHH in two large cohorts to identify novel disease-causing genes. Methods: We analyzed exome and genome sequencing data from 1,822 patients with IHH from two independent cohorts. Rare variants were filtered using pedigree-informed inheritance models. PLEKHA6 expression in the postmortem human hypothalamus was tested at the mRNA and protein level. Functional studies assessed kisspeptin secretion in cell-based assays. Results: We identified 18 distinct PLEKHA6 variants in 24 patients from 20 unrelated families (1.3% of cohort). Variants segregated with disease under autosomal recessive and autosomal dominant (with variable penetrance) inheritance patterns. PLEKHA6 was robustly expressed in the hypothalamus and showed clear colocalization with neurokinin B, which served as the marker for the GnRH pulse generator. Functional studies demonstrated that patient variants significantly impaired kisspeptin secretion. Conclusion: PLEKHA6 is a novel IHH gene and the first reported regulator of kisspeptin secretion from the kisspeptin-neurokinin B-dynorphin (KNDy) neurons, which have recently been established as the GnRH pulse generator. These findings establish impaired kisspeptin release as a new disease mechanism in IHH and highlight the critical role of neuropeptide trafficking in reproductive function.

4
Colocalization and discordance between plasma and brain protein quantitative trait loci

Cheng, Y.; Zhang, W.; Lu, T.

2026-05-05 genetics 10.64898/2026.05.01.722237 medRxiv
Top 0.1%
7.1%

Studies of protein quantitative trait loci (pQTLs) provide opportunities to interpret complex trait genetics and identify potential biomarkers and therapeutic targets. Circulating proteins are commonly used in pQTL studies due to the accessibility of blood-based measurements, but their levels may not always reflect regulation in disease-relevant tissues. We assessed colocalization and discordance between plasma and dorsal prefrontal cortex cis-pQTLs using data from four large-scale studies and investigated their implications for downstream analyses. Across the proteins examined, at most 80% of the cis-pQTLs showed evidence of colocalization. Among the colocalized loci, approximately 20% exhibited opposite directions of genetic effects. We characterized tissue-specific gene expression profiles based on data from the Genotype-Tissue Expression project. Proteins with colocalized cis-pQTLs were more likely to have high gene expression levels in systemic tissues and immune cells, whereas the remaining proteins were more likely to have high expression in brain tissues. We conducted Mendelian randomization (MR) analyses using neuroticism as an illustrative outcome to compare effect estimates derived using instruments from different pQTL studies. MR analyses identified 13 proteins significantly associated with neuroticism, including six with opposite effect directions between plasma and dorsal prefrontal cortex, highlighting the importance of tissue context. Overall, circulating pQTLs remain informative for proteins from systemic and immune pathways, while incorporating tissue-specific data may provide additional insight for proteins with more localized expression. Considering multiple tissue contexts may refine the interpretation of protein-trait associations and may improve the prioritization of candidate targets.

5
Meta-analysis of over 8,000 individuals from Hawai'i and Samoa for genetic associations to cardiometabolic phenotypes

Dinh, B. L.; Wang, X.; Sheng, X.; Wan, P.; Srivastava, A. K.; Naseri, T.; Viali, S.; Wilkens, L.; Le Marchand, L.; Haiman, C. A.; Weeks, D.; Chiang, C. W. K.; Carlson, J. C.

2026-05-12 genetic and genomic medicine 10.64898/2026.05.08.26352761 medRxiv
Top 0.1%
6.5%

Although genome-wide association studies (GWAS) now routinely reveal genetic associations and biological insights in millions of individuals, underrepresentation of global populations, such as those from Polynesia, continues to persist. These exclusions, often driven by logistical challenges and lack of data, prevent systematic identification of population-enriched associations, such as the association with BMI and type 2 diabetes of the missense variant at the CREBRF locus, which is common in Polynesian populations but rare in global populations. Armed with the recently updated TOPMed imputation panel that could benefit studies in diverse populations that previously had poorer imputation performance, we performed the first GWAS of Native Hawaiians and the largest to date of Polynesian-ancestry populations (combined N up to 8,461) to identify population-enriched associations for 13 adiposity and cardiometabolic traits available across both cohorts: BMI, fasting glucose, fasting insulin, HDL, height, hip circumference, HOMA-IR, LDL, T2D, total cholesterol, triglycerides, waist circumference, and waist-hip ratio. We found 25 trait-loci associations that met genome-wide significance: 20 previously reported or known associations and 5 associations newly confirmed via meta-analysis. In particular, with improved statistical power, we were able to confirm the suspected association of the missense CREBRF variant with fasting glucose levels. The remaining 4 potentially novel loci-trait associations for BMI, LDL, and waist-hip ratio, however, were not replicated in multi-ethnic datasets from All-of-Us despite having reasonable power to replicate. The lack of Polynesian-enriched findings outside of the CREBRF locus informs the bounds of the effect sizes or frequency of any enriched variants, and suggests that further expansion of cohort sizes from this region of the world and improved imputation references specific to these populations are needed to identify more population-enriched associations.

6
Anti-inflammatory and pro-proliferative effects of fasudil in human trisomy 21 neural progenitor cells

Baxter, L. L.; Lee, S.; Fuentes, K.; Mosley, I.; Raymond, J.; Guedj, F.; Slonim, D.; Zhou, D.; Glotfelty, E.; Tweedie, D.; Grieg, N.; Bianchi, D.

2026-03-20 pharmacology and toxicology 10.64898/2026.03.19.712922 medRxiv
Top 0.1%
6.4%

Down syndrome (DS) results from trisomy for human chromosome 21 and is the most frequent genetic cause of intellectual disability. No effective treatments currently exist that improve neurodevelopment and cognition. Atypical brain development in individuals with DS is apparent before birth, which suggests that the optimal time to begin administration of therapies is prenatally. Human neural progenitor cell (NPC) cultures provide a tractable in vitro model system to examine the effects of trisomy 21 (T21) on neurodevelopment and to measure the effects of pharmacological interventions. Here we report the results of preclinical studies evaluating 24 candidate therapies. RNA-Seq analyses found that euploid and T21 NPCs showed different transcriptomic responses to five candidate pharmacotherapies. The Rho-associated coiled-coil kinase (ROCK) inhibitor fasudil increased proliferation of T21 NPCs, reduced expression of inflammatory pathway genes in T21 NPCs, and reduced markers of inflammation in LPS-stimulated microglia model systems. These results demonstrate that fasudil can alter multiple T21-associated abnormalities in a beneficial manner, suggesting that fasudil warrants further study as a candidate prenatal pharmacotherapy for DS.

7
Proteogenomic analysis of 5,411 plasma proteins in sickle cell disease patients

Groza, C.; Chignon, A.; Lo, K. S.; Bellegarde, V.; Bartolucci, P.; Lettre, G.

2026-04-07 genetic and genomic medicine 10.64898/2026.04.06.26350255 medRxiv
Top 0.1%
6.4%

There are few therapeutic options to treat patients with sickle cell disease (SCD), a blood disorder caused by mutations in the β-globin gene that affects >7M individuals worldwide. Combining human genetics and high-throughput proteomics can help identify new drug targets. Here, we present results from a proteogenomic analysis of the plasma proteome in SCD patients. We measured the levels of 5,411 plasma proteins and tested their associations with common genetic variation in 343 SCD patients. After conditional analyses, we identified 560 protein quantitative trait loci (pQTL), including 58 (10%) that are novel. Many of these pQTL are not specific to SCD patients and associate with clinically relevant traits in non-SCD African Americans from the Million Veteran Program (e.g. hemoglobin concentration, triglycerides). The effect sizes of the pQTL are largely concordant between SCD and non-SCD individuals, although we found examples (e.g. APOL1, haptoglobin) with evidence of heterogeneity that suggests an interaction between the plasma proteome and the SCD genotype. Finally, we combine pQTL and genome-wide association study results for fetal hemoglobin (HbF) in a Mendelian randomization analysis to prioritize five proteins that may increase HbF production (ENPP5, LBP, NAAA, PT3X, ZP3).

8
Assessing the clinical significance of a novel rare variant in Loeys-Dietz Syndrome by combining AI-driven modelling and cell biology

Boukrout, N.; Delage, C.; Comptdaer, T.; Arondal, W.; Jemel, A.; Azabou, N.; Bousnina, M.; Mallouki, M.; Sabaouni, N.; Arbi, R.; Kchaou, S.; Ammar, H.; Hantous-Zannad, S.; Jilani, H.; Elaribi, Y.; Benjemaa, L.; Van der Hauwaert, C.; Larrue, R.; CHEOK, M.; Perrais, M.; Lefebvre, B.; Cauffiez, C.; Pottier, N.

2026-03-31 genetic and genomic medicine 10.64898/2026.03.30.26349510 medRxiv
Top 0.1%
6.2%

Loeys-Dietz syndrome (LDS) is an autosomal dominant connective-tissue disorder caused by genetic variants in TGF-β pathway genes, most often TGFBR1/2. While pathogenic TGFBR2 genetic mutations usually cluster in the kinase domain and disrupt SMAD signalling, distinguishing with confidence those with functional impact on TGFBR2 function from rare benign genetic alterations represents one of the most important ongoing challenges for accurate genetic testing. Therefore, there is a pressing need to develop methods that can improve functional variant interpretation. Here, we describe and characterize the functional impact of a novel genetic variant in the TGFBR2 kinase domain (E431K), in a patient with the clinical diagnosis of syndromic genetic aortopathy. We assessed the structural and functional consequences of this variant using AI-driven molecular modelling and in vitro cell-based assays. A high-quality homology-based model of TGFBR2 was generated and computational mutagenesis based on the structural context and evolutionary conservation was used to forecast variant pathogenicity. Relative to wild type, the variant affects protein stability by disrupting intramolecular interactions and likely induces conformational changes that may affect kinase activity and thus TGF-β signalling. This was experimentally confirmed by showing abnormal protein level and alteration of canonical TGF-β pathway activation. Overall, our results establish that the E431K variant leads to aberrant TGF-β signalling and confirm the diagnosis of Loeys-Dietz syndrome type 2 in this patient.

9
Otenabant is a Selective Antagonist of Human PIEZO1

Jaquet, V.; Penttinen, R.; Rodriguez, G.; Castelbou, C.; Cambet, Y.; Rosa, N.; Bourdin, M.; Asghariastanehei, B.; de Lima, F.; Martin, M.; Nader, E.; Serra, S.; Fernandez-Fernandez, J.; Murciano, N.; Connes, P.; Egee, S.; Guizouarn, H.; Kaestner, L.; Fertig, N.; Rotordam, M. G.; Demaurex, N.

2026-04-30 pharmacology and toxicology 10.64898/2026.04.28.720360 medRxiv
Top 0.1%
4.4%

Background and purpose: PIEZO1 mechanosensitive cation channels translate mechanical cues into intracellular Ca2+ and Na+ elevations, enabling cells to respond to physical alterations in their environment. PIEZO1 contributes to red blood cell (RBC) volume homeostasis. Gain-of-function PIEZO1 mutations cause hereditary xerocytosis (HX), a rare, mostly compensated hemolytic anemia, and aberrant channel activation exacerbates sickling and vascular dysfunction in sickle cell disease. Despite strong genetic and physiological evidence supporting PIEZO1 as a therapeutic target, potent and selective inhibitors are limited, and existing compounds show modest specificity or poorly explored mechanisms. Improved pharmacological tools are needed. Experimental approach: We conducted a high-throughput screen of FDA-approved drugs to identify PIEZO1 inhibitors. Compounds were tested at concentrations of 10 µM in a monocytic cell line, using intracellular Ca2+ elevations evoked by the PIEZO1 agonist Yoda1 as read-out. The inhibitory activity of the best hit was validated and compared to existing PIEZO1 inhibitors using electrophysiological analysis, orthogonal PIEZO1-dependent assays across cell lines and human RBCs. As functional proof, we investigated the impact of three PIEZO1 inhibitors on RBC deformability by ektacytometry, after Yoda1 pre-stimulation. Key results: This screen identified Otenabant, a selective Cannabinoid Receptor Type 1 (CB1) antagonist, as a potent PIEZO1 inhibitor. Otenabant dose-dependently inhibited Ca2+ elevations mediated by endogenous or exogenously expressed human PIEZO1, but was ineffective against mouse Piezo1, revealing species-specific channel differences. Otenabant inhibited mechanosensitive currents elicited by shear stress in fibroblasts and by repeated poking in PIEZO1-expressing HEK-293 cells, altering the currents' activation and inactivation kinetics, and prevented Yoda1-induced hyperpolarization in RBCs. Otenabant was able to reverse the negative impact of Yoda1 on RBC deformability. Conclusions and implications: These findings demonstrate the utility of Yoda1-based screening for discovering PIEZO1 antagonists and identify Otenabant as a promising chemical scaffold for developing selective PIEZO1 inhibitors with therapeutic potential.

10
Integrated luminescence and phenotypic profiling for drug discovery in a zebrafish model of Marfan syndrome

Horvat, M.; Caboor, L.; De Rycke, K.; Mennens, L.; Daniels, E.; Wyseur, J.; Verhelst, E.; Roos, I.; Rodriguez-Rovira, I.; Egea, G.; De Backer, J.; Sips, P.

2026-05-13 pharmacology and toxicology 10.64898/2026.05.12.722859 medRxiv
Top 0.1%
4.3%

Background: Marfan syndrome (MFS) is a life-threatening heritable connective tissue disorder caused by pathogenic variants in fibrillin-1, characterized by progressive cardiovascular disease. Current medical therapies slow disease progression but do not prevent major complications, underscoring the need for new treatment strategies and unbiased discovery approaches. Methods: We used a zebrafish model of MFS lacking fibrillin-3 (fbn3-/-), which recapitulates key cardiovascular phenotypes including cardiac stress, valvular defects, arrhythmia, and aortic dilation. To enable sensitive, quantitative assessment of cardiac stress, we generated a novel transgenic zebrafish reporter expressing secreted nanoluciferase under control of the stress-responsive nppb promoter. This reporter was combined with morphological phenotyping and bulbus arteriosus (BA) imaging. We evaluated standard MFS therapies, targeted modulators of TGF-β signaling, and performed an unbiased high-throughput drug screen of over 1,500 clinically approved compounds across multiple developmental treatment windows. Results: fbn3-/- larvae exhibited markedly elevated nppb activity that correlated with phenotypic severity and peaked during stages of highest mortality. The nanoluciferase reporter provided a ~1,000-fold dynamic range, substantially outperforming Firefly luciferase-based assays. Pharmacological inhibition of TGF-β signaling produced transient or deleterious effects, while β-blockers, losartan, and allopurinol failed to consistently improve cardiac stress, pericardial edema, or BA dilation. The unbiased high-throughput drug screen identified a small number of primary and secondary hits; however, none demonstrated reproducible phenotypic rescue upon rigorous multi-dose, multi-time window validation. Conclusions: This study establishes a sensitive zebrafish-based platform for early, quantitative assessment of cardiovascular stress in MFS. Our findings highlight the limited efficacy of current therapies, the context-dependent nature of TGF-β modulation, and the biological complexity underlying MFS pathogenesis. Although no definitive therapeutic candidates were identified, this work lays a robust foundation for expanded unbiased discovery efforts aimed at identifying disease-modifying interventions for MFS.

11
Prioritizing embryos with lower homozygosity may reduce disease risk in children of related individuals undergoing preimplantation genetic testing

Wolfram, T.; Ahangari, M.; Davidson, I.; Wartschinski, L.; Li, J. H.; Eyre, M.; Stern, D.; Schleede, J.; Haghighi, A.; Carmi, S.; Christensen, M.

2026-06-04 genetic and genomic medicine 10.64898/2026.05.30.26354526 medRxiv
Top 0.1%
4.0%

Consanguinity is a reproductive union between individuals who share a recent common ancestor. These unions are common in many regions of the world and increase the burden of rare recessive disorders by elevating autozygosity in offspring. Current reproductive genetic screening focuses on a limited set of known pathogenic variants, leaving most recessive risk unaddressed. Here we argue that embryo-level autozygosity, quantified as the fraction of the genome in long runs of homozygosity (FROH), is a potentially actionable genomic biomarker that can be integrated into routine preimplantation genetic testing as a homozygosity-informed embryo-prioritization framework (PGT-H) that can be layered onto existing embryo biopsy workflows when couples are already undergoing IVF with PGT-A or PGT-M. Using forward simulations of first-cousin and double-first-cousin couples, we show that siblings conceived by the same couple span a wide range of FROH; selecting the lowest-FROH candidate from a cohort of five embryos reduces FROH by approximately 40% on average. Combining these reductions with empirical effect-size estimates, we estimate that for first-cousin couples this strategy could reduce risk of intellectual disability by roughly 35-45% (corresponding to an absolute risk reduction of about 1.8-2.2%) and potentially reduce excess recessive disease burden, while also modestly reducing risk of common diseases such as type 2 diabetes. We outline how existing PGT-A and PGT-M workflows could potentially be extended to report embryo-level FROH and discuss ethical and counseling considerations. Autozygosity-based embryo prioritization offers a principled way to address a component of recessive risk that current variant-centric approaches miss.
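
The FROH-based prioritization described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the genome length, the minimum run length that counts as "long", the function names, and the embryo segment data are all hypothetical, since the abstract gives none of them.

```python
from typing import Dict, List, Tuple

# Assumed constants for illustration only; the abstract does not specify them.
AUTOSOMAL_LENGTH_BP = 2_875_000_000  # rough autosomal genome length (assumption)
MIN_ROH_BP = 1_500_000               # hypothetical cutoff for a "long" run

Segment = Tuple[int, int]  # (start, end) of one run of homozygosity, in base pairs


def froh(roh_segments: List[Segment],
         genome_length: int = AUTOSOMAL_LENGTH_BP,
         min_length: int = MIN_ROH_BP) -> float:
    """Fraction of the genome covered by runs of homozygosity of at least min_length."""
    long_total = sum(end - start
                     for start, end in roh_segments
                     if end - start >= min_length)
    return long_total / genome_length


def lowest_froh_embryo(embryos: Dict[str, List[Segment]]) -> str:
    """Return the name of the embryo with the smallest FROH."""
    return min(embryos, key=lambda name: froh(embryos[name]))


# Made-up segment calls for three sibling embryos.
embryos = {
    "embryo_A": [(10_000_000, 40_000_000), (90_000_000, 120_000_000)],
    "embryo_B": [(5_000_000, 25_000_000)],
    "embryo_C": [(1_000_000, 50_000_000), (60_000_000, 61_000_000)],
}

for name, segments in embryos.items():
    print(name, f"{froh(segments):.4f}")
print("prioritized:", lowest_froh_embryo(embryos))
```

If the absolute risk reduction is simply the baseline risk multiplied by the relative reduction, the quoted figures imply a baseline near 5% (1.8 / 0.35 ≈ 5.1% and 2.2 / 0.45 ≈ 4.9%); that reading is an inference from the abstract's numbers, not a value the abstract states.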

12
LANTERN: Leveraging Local Ancestry Tracts to Enhance Rare-Variant Aggregate Association Testing

Wang, Y.; Tuftin, B.; Raffield, L. M.; Hidalgo, B.; Kerns, S. L.; DeWan, A. T.; Leal, S. M.; Auer, P.

2026-04-27 genetic and genomic medicine 10.64898/2026.04.24.26351693 medRxiv
Top 0.1%
4.0%

Individuals with admixed ancestry comprise a significant proportion of populations of the Americas. Statistical methods have been developed to specifically leverage local ancestry inference to enhance the power and interpretability of genome-wide association studies in admixed populations. However, no such methods currently exist to test for rare-variant aggregate associations. Here we present LANTERN (Leveraging local ANcestry Tracts to Enhance Rare variaNt aggregate associations), a method that infers the alleles that lie on each ancestral haplotype and conducts rare-variant aggregate association testing in a generalized linear mixed model framework. Through simulation studies we demonstrated that LANTERN achieves proper control of Type 1 error while boosting power to detect associations when causal alleles predominately lie on one ancestral haplotype. Using data from a cohort of African American participants from the Jackson Heart Study, LANTERN identified two genes known to be involved in red-blood cell (RBC) biology when local ancestry information was incorporated. Specifically, a burden of rare alleles on European ancestral haplotypes in EPO was associated with both hemoglobin levels (HGB) and RBC counts, whereas a burden of rare alleles on African ancestral haplotypes in EPB42 was associated with HGB and RBC. In summary, LANTERN (i) allows for the identification of ancestry-specific rare-variant associations; and (ii) enhances rare-variant association signals compared to an analysis that ignores local ancestry. LANTERN is implemented in R and is freely available on GitHub.

13
Modeling rare coding variation on chromosome X provides insight into the genetics and differential sex prevalence of autism spectrum disorder

Satterstrom, F. K.; Jodeiry, K.; Mahjani, B.; Hatem, G.; Park, S. J.; Klei, L.; Fu, J. M.; Wigdor, E. M.; the Autism Sequencing Consortium; Betancur, C.; Daly, M. J.; Roeder, K.; Devlin, B.; Buxbaum, J. D.; Cutler, D. J.

2026-05-07 genetic and genomic medicine 10.64898/2026.05.04.26352380 medRxiv
Top 0.1%
3.7%

Autism spectrum disorder (ASD) is estimated to be up to four times as common in males as in females, yet the causes of this prevalence difference are not well established. One possible driver is genetic variation on the X chromosome, as it contains genes capable of contributing to ASD (e.g., PTCHD1, MECP2) and is known to play a role in genetic disorders with differential sex prevalence (e.g., color blindness). However, a lack of power compared to the autosomes, combined with the complexities of modeling its biology, has led to the X being largely overlooked in sequencing studies. Here, we develop quantitative X-linked TADA, a new model designed specifically for application to this chromosome, and use it to analyze rare variation from 50,663 individuals with ASD (and 136,670 individuals total). We find 9 genes on the X associated with ASD at a false discovery rate (FDR) < 0.05 and an additional 9 genes at FDR < 0.2, with many of these previously identified as involved in specific neurodevelopmental disorders. Point estimates of the liability conferred by de novo variants on the X are similar in females and males, with both sexes' estimates elevated >20% above the corresponding autosomal values. We also develop a general theory of how X-linked variation of any additive or non-additive effect influences liability and describe its implications for prevalence. Using this theory and our empirical results, we show how genetic variation on the X could contribute to the sex-differential prevalence of ASD.

14
Comprehensive analysis of de novo variants across 2,497 orofacial cleft trios reveals novel genetic drivers of disease

Kurtas, N. E.; Sanchis-Juan, A.; Shin, E.; Curtis, S. W.; Robinson, K. R.; Lee, A. S.; Alade, A. A.; Zhao, X.; Fu, J.; Diaz Perez, K. K.; Gowans, J. J. L.; Eshete, M. A.; Adeyemo, W. L.; Buxo, C. J.; Padilla, C. D.; Poletta, F. A.; Carreno Torres, A.; Wehby, G. L.; Hecht, J. T.; Moreno Uribe, L. M.; Mukhopadhyay, N.; Shaffer, J. R.; Weinberg, S. M.; Murray, J. C.; Beaty, T. H.; Butali, A.; Talkowski, M.; Marazita, M. L.; Leslie-Clarkson, E. J.; Brand, H.

2026-05-24 genetic and genomic medicine 10.64898/2026.05.21.26352934 medRxiv
Top 0.1%
3.6%

Background: Orofacial clefts (OFCs) and other palate abnormalities (PAs) are among the most common birth defects worldwide and are characterized by the abnormal formation of the lip and/or palate. Genetic studies have traditionally classified OFC cases as either syndromic, involving OFCs alongside other congenital anomalies, or nonsyndromic, which represent the majority of cases and occur in isolation. Emerging genomic evidence indicates that genes traditionally associated with syndromic forms of OFC can also harbor variants contributing to isolated cases, challenging the notion of a strict dichotomy between these categories and supporting their integration for gene discovery. Methods: In this study, we applied multiple analytic approaches to characterize the genetic architecture of OFC and PAs by integrating genomic data from 2,497 trios with an OFC (n=2080) and PA (n=417) affected proband. We compared these findings across OFC subtypes and syndromic status with those from 5,515 control trios to identify enriched biological pathways and mechanisms and to prioritize candidate genes using variant burden testing. Results: We observed a significant enrichment of de novo protein-truncating and damaging missense variants in cases compared to controls (OR = 2.17, p = 1.21 × 10^-32), with particularly strong signals in biologically relevant gene sets involving OFC-associated, constrained, Mendelian disorder, and mouse candidate genes. Variant burden testing identified 39 OFC risk genes at FDR ≤ 0.05, which we then integrated with 593 established OFC genes to interrogate the functional underpinnings of OFC via network analysis. This analysis revealed 309 high-order interactor genes not previously associated with OFC. Notably, this OFC network clustered into ten distinct biological pathways, with nucleosome-associated genes showing significant enrichment among cases in our cohort (OR = 14.8, p = 8.1 × 10^-4). In a final integrative step, we combined evidence across all analyses to nominate 231 candidate genes, 32 of which contained at least two deleterious de novo variants in our cohort. Conclusions: These findings underscore the value of integrating diverse OFC and PA subtypes, syndromic status, and variant classes to refine the genetic architecture of these disorders, highlighting both phenotypic expansion of known disease genes and the emergence of novel gene-phenotype associations.

15
Specific HLA Class I and II alleles are associated with a higher risk for tumor formation in Neurofibromatosis type 1

Sussman, J. H.; Brosius, S. N.; Gel, B.; Li, P.; Farrel, A.; Rokita, J. L.; Serra, E.; Tan, K.; Fisher, M. J.; Maris, J. M.; De Raedt, T.

2026-05-05 genetic and genomic medicine 10.64898/2026.05.04.26352173 medRxiv
Top 0.1%
3.6%

Neurofibromatosis type 1 (NF1) is a common autosomal dominant genetic tumor predisposition syndrome [1]. NF1 patients display remarkable phenotypic variability, even within families carrying the same NF1 mutation [2]. With few exceptions, the identification of specific genotype-phenotype correlations has remained elusive [3-6]. We utilized RNA-seq data and direct DNA sequencing to determine HLA genotypes for individuals with NF1-associated high-grade glioma (HGG, n=25), low-grade glioma (LGG, n=79), and malignant peripheral nerve sheath tumors (MPNST, n=105). Odds ratios (OR), binomial p-values, and false discovery rate (Q) values were calculated by comparing observed carrier frequencies against expected frequencies derived from ethnicity-matched population data. We find that specific HLA class I and II alleles are associated with different NF1 tumor types. For example, HLA-B*40:02 is significantly associated with NF1-MPNST (OR=3.71, p=0.001, Q=0.02), increasing the lifetime risk of MPNST from 10% to about 29%. The relative cancer risk for an individual in the general population carrying a risk allele can be high; however, that individual's absolute risk of cancer typically remains very low. In contrast, individuals who carry a risk allele and are also burdened with a tumor predisposition syndrome will have a substantially higher absolute risk of developing a tumor, simply because they start at a higher baseline susceptibility to tumors. The identification of HLA risk alleles for NF1 tumor development is therefore important, as it will allow for risk-adapted screening or more aggressive treatment of individuals with a specific HLA haplotype. If confirmed, these findings will thus improve clinical care and potentially the outcomes of individuals with NF1.
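The step from an odds ratio of 3.71 to a lifetime risk of about 29% can be reproduced with a short calculation. The sketch below is an illustration only, not the authors' code: the function name `risk_from_odds_ratio` is hypothetical, it assumes the 10% baseline can be treated as the reference risk to which the odds ratio applies, and the 0.1% general-population baseline is an invented value used purely for contrast.

```python
def risk_from_odds_ratio(baseline_risk: float, odds_ratio: float) -> float:
    """Convert a baseline absolute risk and an odds ratio into a carrier risk.

    Assumption (not stated in the abstract): the odds ratio is applied
    directly to the odds implied by the baseline risk.
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    baseline_odds = baseline_risk / (1.0 - baseline_risk)  # risk -> odds
    carrier_odds = baseline_odds * odds_ratio              # apply the OR
    return carrier_odds / (1.0 + carrier_odds)             # odds -> risk


if __name__ == "__main__":
    # Values quoted in the abstract: 10% baseline MPNST risk, OR = 3.71.
    nf1_risk = risk_from_odds_ratio(0.10, 3.71)
    print(f"NF1 carrier lifetime risk: {nf1_risk:.1%}")

    # Hypothetical low baseline (0.1%) to show why the same OR matters
    # far less in absolute terms outside a predisposition syndrome.
    general_risk = risk_from_odds_ratio(0.001, 3.71)
    print(f"Low-baseline carrier risk: {general_risk:.2%}")
```

With the abstract's numbers the result is roughly 29%, matching the reported figure, whereas the same odds ratio applied to a very low baseline leaves the absolute risk well under 1%.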

16
Transcriptome-Wide Alternative Splicing Analysis Implicates Complex Events in Bipolar Disorder

Martinez-Jimenez, M.; Garcia-Ortiz, I.; Romero-Miguel, D.; Kavanagh, T.; Marshall, L. L.; Bello Sousa, R. A.; Sanchez Alonso, S.; Alvarez Garcia, R.; Benavente Lopez, S.; Di Stasio, E.; Schofield, P. R.; Baca-Garcia, E.; Mitchell, P. B.; Cooper, A. A.; Fullerton, J. M.; Toma, C.

2026-04-21 genetic and genomic medicine 10.64898/2026.04.19.26351209 medRxiv
Top 0.1%
3.6%

Alternative-splicing events (ASEs) increase transcriptomic variability and play key roles in biological functions. The contribution of ASEs to bipolar disorder (BD) remains largely unexplored. We performed a Transcriptome-Wide Alternative-Splicing Analysis (TWASA) to identify ASEs and genes potentially involved in BD. The study comprised 635 individuals: a discovery sample (DS) of 31 individuals from eight multiplex BD families (16 BD cases; 15 unaffected relatives), and a replication sample (RS) of 604 subjects (372 BD cases; 232 controls). Sequencing was conducted on RNA from lymphoblastoid cell lines (DS) and whole blood (RS). TWASA was performed using VAST-TOOLS (VT), rMATS (RM), and MAJIQ/MOCCASIN (MCC). Gene-set association analyses of genes containing ASEs were performed across six psychiatric disorders. Novel ASEs (nASEs) were investigated in the DS using FRASER. Limited gene overlap was observed across TWASA tools. MCC identified 2,031 complex ASEs involving 1,508 genes, showing the strongest genetic association with BD across psychiatric phenotypes. Prioritization of MCC-identified ASE genes yielded 441 candidates, including DOCK2 as the top candidate from the DS. Replication was obtained for 98 genes, five with an identical ASE, and four (RBM26, QKI, ANKRD36, and TATDN2) showing a concordant percentage-spliced-in direction with the DS. Finally, 578 nASEs were identified in the DS, with no evidence of familial segregation or differences in ASE types. This first TWASA in BD reveals tool-specific variability, complex ASEs in genes specifically associated with BD, and novel candidate genes for BD. Alternative transcript isoform abundance may represent a mechanism contributing to BD pathophysiology.
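For readers unfamiliar with the percentage-spliced-in (PSI) measure mentioned above, the following minimal sketch shows the basic definition and what a "concordant direction" between two samples means. It is a simplified illustration, not the pipeline used in the study: the function names and read counts are hypothetical, and dedicated splicing tools typically apply further corrections that are omitted here.

```python
def psi(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percentage spliced in: share of reads supporting inclusion of an event."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return 100.0 * inclusion_reads / total


def delta_psi(case_psi: float, control_psi: float) -> float:
    """Difference in PSI between cases and controls (in percentage points)."""
    return case_psi - control_psi


def concordant_direction(dpsi_discovery: float, dpsi_replication: float) -> bool:
    """True when the case-control PSI shift has the same sign in both samples."""
    return dpsi_discovery * dpsi_replication > 0


if __name__ == "__main__":
    # Hypothetical read counts for one event in each sample.
    discovery = delta_psi(psi(60, 40), psi(45, 55))
    replication = delta_psi(psi(520, 480), psi(490, 510))
    print(discovery, replication, concordant_direction(discovery, replication))
```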

17
Deriving LD-adjusted GWAS summary statistics through linkage disequilibrium deconvolution

Nouira, A.; Favre Moiron, M.; Tournaire, M.; Verbanck, M.

2026-04-11 genetic and genomic medicine 10.64898/2026.04.10.26350574 medRxiv
Top 0.1%
3.6%

Genome-wide association studies (GWAS) have identified numerous genetic variants associated with complex traits. However, linkage disequilibrium (LD) confounds these associations, leading to false positives where non-causal variants appear associated because they are correlated with nearby causal variants. This is particularly the case in highly polygenic traits, where the genome can be saturated with causal variants. To address this issue, we propose LDeconv, a method based on truncated singular value decomposition (SVD) that adjusts GWAS summary statistics without requiring individual-level genotype data. This approach accounts for LD structure, isolates causal variants in high-LD regions, and improves the reliability of effect size estimates. We assess its performance through simulations across various LD scenarios, conduct extensive sensitivity analyses, and apply it to real GWAS data from the UK Biobank. Our results demonstrate that LDeconv effectively reduces false discoveries while preserving true associations, offering a robust framework for post-GWAS analysis.
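The general idea of using a truncated decomposition of the LD matrix to move from marginal to approximately joint effects can be sketched as follows. This is a generic illustration and not the LDeconv algorithm, whose details are not given in the abstract. It assumes the standard approximation that marginal z-scores equal the LD correlation matrix times the joint effects, and the function name `truncated_ld_deconvolution`, the rank parameter `k`, and the toy LD matrix are all hypothetical.

```python
import numpy as np


def truncated_ld_deconvolution(z_marginal, ld_matrix, k):
    """Approximate joint effects from marginal z-scores via a rank-k pseudo-inverse.

    Assumes z_marginal ~= ld_matrix @ joint_effects, with ld_matrix a
    symmetric correlation matrix. Only the k largest eigen-components are
    inverted, which regularises the solution in high-LD regions.
    """
    z = np.asarray(z_marginal, dtype=float)
    R = np.asarray(ld_matrix, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)      # symmetric eigendecomposition
    top = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    V, w = eigvecs[:, top], eigvals[top]
    return V @ ((V.T @ z) / w)                # rank-k pseudo-inverse applied to z


if __name__ == "__main__":
    # Toy LD block: variants 0 and 1 are in strong LD, variant 2 is independent.
    R = np.array([[1.0, 0.9, 0.0],
                  [0.9, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    joint_true = np.array([1.0, 0.0, 0.5])    # only variants 0 and 2 are causal
    z = R @ joint_true                        # variant 1 looks associated via LD
    print("marginal:", z)
    print("k=3:", truncated_ld_deconvolution(z, R, k=3))
    print("k=2:", truncated_ld_deconvolution(z, R, k=2))
```

In this toy case the full-rank solution recovers the simulated joint effects, while the rank-2 solution shares the signal between the two correlated variants. That illustrates the trade-off a truncation level has to balance: stability against resolution within high-LD regions.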

18
Identifying disease-causing mechanisms and fundamental biology of neuromuscular disorder genes through genomic feature analysis

Martin, A.; Llanes-Cuesta, M. A.; Hartley, J. N.; Frosk, P.; Drogemoller, B. I.; Wright, G. E. B.

2026-04-22 genetics 10.64898/2026.04.21.719902 medRxiv
Top 0.1%
3.6%

Introduction: Neuromuscular disorders (NMDs) encompass a broad group of conditions that primarily affect the peripheral nervous system. They are often caused by genetic alterations that impair skeletal muscle function and result in debilitating symptoms. Obtaining an accurate molecular diagnosis remains a challenge, potentially because some variants lie in genes that have yet to be identified as causal. We therefore used advanced computational methods to study the genetic architecture of NMDs and to identify key features that distinguish NMD genes from other genes in the broader genome. Methods: Curated genes implicated in NMDs (n = 639; GeneTable of NMDs) were obtained and merged with a comprehensive set of genomic features for human autosomal protein-coding genes. Machine-learning-based feature selection and ranking were performed using Boruta, along with complementary analytical approaches. These analyses were used to identify the most important genic features (n = 134, subcategories: gene complexity, genetic variation, expression patterns, and other general gene traits) for discriminating NMD genes from other genes in the genome. Results: NMD genes exhibit enriched expression in disease-relevant tissues, including skeletal muscle and heart. Additionally, compared with other protein-coding genes, these genes exhibit increased transcriptomic complexity (e.g., longer transcripts and more unique isoforms), contain more short tandem repeats, and show greater variation in conservation across model organisms. Conclusions: This study identified several key genomic features that may distinguish NMD genes from the rest of the genome. This may enhance the identification of novel causal genes and could ultimately facilitate earlier diagnosis and medical management for affected individuals.

19
Mutation-Induced Pocket Deactivation: How Ser353/Pro245 Alters KCa2.2 vs KCa3.1 Ligand Selectivity

Gozzi, M.; Massa, J.; Koch, O.

2026-05-06 pharmacology and toxicology 10.64898/2026.05.03.722491 medRxiv
Top 0.1%
3.5%

The KCa2.2 and KCa3.1 channels are fundamental regulators of cellular K+ concentration and promising targets for treating diseases such as spinocerebellar ataxia and cancer. To fully exploit their therapeutic potential, and to continue studying their pathophysiological role, it is crucial to develop selective modulators for each of these two channels. Here we present a computational study to identify the molecular determinants behind the selectivity of two recently reported KCa2.2 modulators. We leveraged a protocol combining in silico mutagenesis, molecular dynamics simulations, and protein-ligand docking to analyse the pockets targeted by these ligands. We identified a Ser353/Pro245 substitution as the main driver of the distinct pocket shapes in the KCa2.2 and KCa3.1 channels, ultimately defining modulator selectivity. This approach provides novel insights into the structural differences of this binding site across potassium channel subtypes, shedding light on the selectivity determinants of modulators targeting this pocket.

20
Contextualizing the Utility of Polygenic Risk Scores using Absolute Risk Models in Diverse Ancestry Populations

Chatterjee, N.; Martina, F.; Kachuri, L.; Natarajan, P.; Witte, J.; Huo, D.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354842 medRxiv
Top 0.1%
3.5%

Polygenic risk scores (PRSs) are emerging as powerful tools for quantifying inherited risk for common diseases and, in some cases, are approaching clinical implementation. A major concern for PRS implementation is their limited accuracy in non-European populations, particularly in those of African ancestry. However, past evaluations have focused on metrics such as relative risk or AUC, which do not capture background risk arising from contextual factors. We introduce a novel measure of variable importance, the conditional average derivative estimator (CADE), to evaluate PRS utility across diverse contexts and populations within absolute risk models that integrate PRSs with other relevant risk factors. We illustrate this framework by integrating PRSs for breast and prostate cancer within age-specific absolute risk models for incidence and mortality fit using individual-level data from the All of Us Research Program with inputs from the National Cancer Institute SEER cancer registry. Our projections show that although the PRSs are known to have the lowest discriminatory accuracy in African Americans (AA), there are contexts in which they provide greater utility, such as for the stratification of prostate cancer risk and mortality, where the CADE values for AA were 2- and 7-fold higher than for European Americans. These findings suggest that conclusions about the limited clinical utility of PRS in non-European populations may be premature and underscore the need to quantify PRS risk-stratification utility at the absolute-risk level, while accounting for disease onset, survival, and broader health and economic factors.